
Printing a TensorRT Engine's Input/Output Info

When debugging a plan file you often need to inspect what is inside it. Here is a handy script that prints the relevant information for every input and output binding of an engine.

Code

```python
import sys

import numpy as np
import tensorrt as trt

trt_logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(trt_logger)
with open(sys.argv[1], "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

print("Engine Info:")
for i, binding in enumerate(engine):
    # Prepend the (implicit) max batch size to the per-binding shape.
    shape = [engine.max_batch_size, *engine.get_binding_shape(binding)]
    dtype = trt.nptype(engine.get_binding_dtype(binding))
    # Dynamic dims are reported as -1, so abs() keeps the product positive.
    volume = abs(trt.volume(engine.get_binding_shape(binding)))
    desc = "input" if engine.binding_is_input(binding) else "output"
    print(f"{i} type: {desc}\n binding: {binding}\n"
          f" data: {np.dtype(dtype).name}\n shape: {shape} => {volume}\n")
```
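The `volume` the script prints is just the product of the binding's dimensions; the `abs()` guards against dynamic dimensions, which TensorRT reports as `-1` and which would make the raw product negative. A minimal pure-Python illustration of the same arithmetic (the shapes below are made up for illustration and do not come from any particular engine):

```python
import math

import numpy as np


def binding_volume(shape):
    # Product of all dims; TensorRT reports dynamic dims as -1,
    # so take abs() just like the script above does with trt.volume.
    return abs(math.prod(shape))


# Hypothetical static shape: a 1 x 3 x 256 x 256 image tensor.
print(binding_volume((1, 3, 256, 256)))   # 196608 elements

# Hypothetical dynamic shape: batch dim reported as -1.
print(binding_volume((-1, 3, 256, 256)))  # abs() keeps it at 196608

# Byte size once the dtype is known, e.g. float32 (4 bytes per element):
print(binding_volume((1, 3, 256, 256)) * np.dtype(np.float32).itemsize)  # 786432
```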

The output looks like this:

```
[TensorRT] INFO: [MemUsageChange] Init CUDA: CPU +329, GPU +0, now: CPU 341, GPU 999 (MiB)
[TensorRT] INFO: Loaded engine size: 4 MB
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine begin: CPU 346 MiB, GPU 999 MiB
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +491, GPU +210, now: CPU 842, GPU 1211 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +287, GPU +200, now: CPU 1129, GPU 1411 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1129, GPU 1393 (MiB)
[TensorRT] INFO: [MemUsageSnapshot] deserializeCudaEngine end: CPU 1129 MiB, GPU 1393 MiB
Engine Info:
0 type: input
 binding: data
 data: float32
 shape: [1, 1, 3, 256, 256] => 36864

1 type: output
 binding: result
 data: float32
 shape: [1, 8, 8] => 8
```
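Note that the script relies on the `binding_*` and `max_batch_size` APIs, which TensorRT deprecated in 8.5 and removed in 10.x in favor of named I/O tensors. A sketch of the same tool against the newer API might look like the following; `describe_tensor` is a helper name of my own, and the TensorRT part is untested here:

```python
# Sketch for TensorRT >= 8.5: enumerate named I/O tensors instead of bindings.
import sys


def describe_tensor(index, name, mode, dtype_name, shape):
    """Pure formatting helper; mode is the string 'input' or 'output'."""
    return (f"{index} type: {mode}\n binding: {name}\n"
            f" data: {dtype_name}\n shape: {list(shape)}\n")


def main(plan_path):
    import numpy as np
    import tensorrt as trt

    runtime = trt.Runtime(trt.Logger(trt.Logger.INFO))
    with open(plan_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())

    print("Engine Info:")
    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        mode = ("input"
                if engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT
                else "output")
        dtype = np.dtype(trt.nptype(engine.get_tensor_dtype(name))).name
        print(describe_tensor(i, name, mode, dtype,
                              engine.get_tensor_shape(name)))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```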

Usage

```shell
docker run -it --rm --gpus all \
    -v /home/user/trtdebug.py:/models/trtdebug.py \
    -v /home/user/model.plan:/models/model.plan \
    geminihub.oa.com:80/karizhang/nvcr.io/nvidia/tensorrt:21.08-py3 \
    python /models/trtdebug.py /models/model.plan
```

(`--rm` only needs to be given once, and running in the foreground with `-it` rather than detached with `-itd` lets you see the script's output directly.)